@martykulma martykulma commented Oct 29, 2025

Exposes librdkafka settings as system settings:

  • retry.backoff.ms
  • retry.backoff.max.ms
  • reconnect.backoff.ms
  • reconnect.backoff.max.ms

Materialize (MZ) provides the default values, most of which match the current librdkafka defaults described here.

I increased the default value for reconnect.backoff.max.ms to 30s based on local testing. By the time the backoff reaches the 30s cap, librdkafka has already retried the connection several times in quick succession. If those attempts failed, the cause is likely longer lived. Happy to set this to the librdkafka default of 10s if folks disagree.
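To make the "a few times quickly" reasoning concrete, here is a minimal sketch of a capped exponential backoff schedule. It is not librdkafka's implementation: the doubling factor and the 100 ms starting value are assumptions for illustration, jitter is ignored, and `backoff_schedule` is a hypothetical helper name.

```python
from typing import List


def backoff_schedule(initial_ms: int, max_ms: int, factor: int = 2) -> List[int]:
    """Return successive delays in ms, ending with the first delay that hits the cap.

    Assumption: each delay is the previous one multiplied by `factor`,
    clamped to `max_ms`. Real clients typically add jitter on top.
    """
    if initial_ms <= 0 or max_ms < initial_ms or factor < 2:
        raise ValueError("expected 0 < initial_ms <= max_ms and factor >= 2")
    delays = []
    delay = initial_ms
    while delay < max_ms:
        delays.append(delay)
        delay *= factor
    delays.append(max_ms)  # every later attempt waits for the cap
    return delays


# Assumed starting backoff of 100 ms, compared under a 10s and a 30s cap.
for cap_ms in (10_000, 30_000):
    schedule = backoff_schedule(100, cap_ms)
    print(cap_ms, len(schedule) - 1, "attempts below the cap:", schedule)
```

Under these assumptions, the 30s cap adds only two more sub-cap attempts (12.8s and 25.6s) compared with the 10s cap. After that, a broker that stays down is retried every 30s instead of every 10s.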

Motivation

Exposes settings to help alleviate https://github.com/MaterializeInc/database-issues/issues/9801

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@martykulma martykulma force-pushed the maz-expose-kafka-backoff-params branch from fa67352 to 90c92ba Compare October 31, 2025 16:12
@martykulma martykulma changed the title from "[DNM] - Expose retry and reconnect backoff settings via LD" to "librdkafka: expose retry and reconnect backoff settings" Oct 31, 2025
@martykulma martykulma marked this pull request as ready for review November 11, 2025 17:07
@martykulma martykulma requested review from a team as code owners November 11, 2025 17:07
"kafka_retry_backoff",
"kafka_retry_backoff_max",
"kafka_reconnect_backoff",
"kafka_reconnect_backoff_max",
Contributor

Doesn't hurt to have them configurable in tests?

Contributor Author

I didn't think this prevented configuration in tests. My understanding is that these can still be set via additional_system_parameter_defaults or ALTER SYSTEM SET, if needed.

I put them in this list because they didn't seem to belong in the other camps (per the comment). We set reasonable defaults in the code, and testing variations of the retry backoff settings seems unlikely to surface meaningful bugs.
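For anyone who does want to override these in a test, here is a rough sketch of building the `ALTER SYSTEM SET` statements mentioned above. The pairing of librdkafka keys with system parameter names is inferred from the names alone, the quoted duration value format (for example `'10s'`) is an assumption, and `alter_system_set` is a hypothetical helper, not an existing function.

```python
# Assumed pairing, inferred from the names in this PR rather than confirmed.
PARAMS = {
    "retry.backoff.ms": "kafka_retry_backoff",
    "retry.backoff.max.ms": "kafka_retry_backoff_max",
    "reconnect.backoff.ms": "kafka_reconnect_backoff",
    "reconnect.backoff.max.ms": "kafka_reconnect_backoff_max",
}


def alter_system_set(librdkafka_key: str, value: str) -> str:
    """Render an ALTER SYSTEM SET statement for one backoff setting.

    Raises KeyError for keys outside the four settings exposed here.
    The value is emitted as a quoted string literal (assumed format).
    """
    name = PARAMS[librdkafka_key]
    escaped = value.replace("'", "''")  # escape single quotes for SQL
    return f"ALTER SYSTEM SET {name} = '{escaped}';"


# Example: drop the reconnect cap back to the librdkafka default of 10s.
print(alter_system_set("reconnect.backoff.max.ms", "10s"))
```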

@martykulma martykulma requested a review from petrosagg November 11, 2025 18:14
Contributor

@teskje teskje left a comment

I have no context on these options, but seems straightforward enough!

- set default reconnect backoff max to 30s
@martykulma martykulma force-pushed the maz-expose-kafka-backoff-params branch from 90c92ba to bfcc9a6 Compare November 17, 2025 13:39
@martykulma
Contributor Author

> I have no context on these options, but seems straightforward enough!

TFTR @teskje!

@martykulma martykulma merged commit b09a105 into MaterializeInc:main Nov 17, 2025
129 checks passed
3 participants